Track new words and corrections to en-US dictionary
Categories
(Core :: Spelling Checker: en-US Dictionary, enhancement)
People
(Reporter: flod, Unassigned)
Details
Using this bug to track and discuss requests for new words in the Mozilla en-US dictionary.
- Try to provide information on the terms you want to add, in particular references to external sources that confirm the usage of the term (e.g. Merriam-Webster or Oxford online dictionaries).
- Include all possible forms, e.g. plural and genitive for nouns, different tenses for verbs.
- Names of companies or people should not be included.
The goal is to add words that are commonly used, not every word, as adding everything might have a negative impact on performance (and will affect the installer size).
Comment 1•5 months ago
The list of missing words from the Wiktionary English dictionary is available at https://tdulcet.github.io/Missing-Words/ and is automatically updated monthly. With the default options (only words without numbers or symbols and with a Wikipedia page), it currently includes 7,707 words for consideration. It can also be downloaded in TSV format. See Bug 1811451 comment 11 and below for more information. Since then, I have added several new options: sorting the words, disabling the normalization of words before checking whether they are already in the dictionary, and showing words that have one or more forms missing from the dictionary. Feedback is welcome.
The missing words from the Ispell small and medium American English dictionaries are available in Bug 1811451 comment 2, which includes 13,997 words for consideration. I am always looking for high-quality word lists or other ideas about how we could systematically find words that should be included in the Mozilla en-US dictionary.
TikTok (the correct capitalization of the most popular video app in the world).
YouTube is already present in the dictionary.
Reporter
Comment 3•5 months ago
See first comment
Names of companies or people should not be included.
We should avoid adding more company names, because they don't last. Firefox's personal dictionaries can solve that.
Comment 4•5 months ago
urbanism (Merriam-Webster, Wiktionary)
Comment hidden (advocacy)
Reporter
Comment 7•4 months ago
The last duped bug has "unpleased", which seems to be mostly British
https://www.collinsdictionary.com/us/dictionary/english/unpleased
https://dictionary.cambridge.org/dictionary/english/unpleased
Reporter
Comment 8•4 months ago
Bug 1550932: taekwondo
Comment 9•4 months ago
We can have a paywall but things can't be paywalled??
And we can have a leak but we can't have someone who does so, a leaker? (plus it's plural.)
Comment 10•3 months ago
"indicia" come on. this entire setup where a german native speaker controls the dictionary needs to end if you want us to pretend firefox is worth fixing
Reporter
Comment 11•3 months ago
(In reply to Jack Laxson from comment #10)
"indicia" come on. this entire setup where a german native speaker controls the dictionary needs to end if you want us to pretend firefox is worth fixing
You might want to take a look at https://bugzilla.mozilla.org/page.cgi?id=etiquette.html
Flagging missing words in this bug is fine, the rest of the comment is completely unnecessary.
Comment hidden (off-topic)
Comment hidden (off-topic)
Comment 14•2 months ago
TypeScript: https://en.wiktionary.org/wiki/TypeScript
JavaScript/M (coming from mozilla-specific.txt(?)) is already there.
Adding TypeScript/M alongside the existing and correct (yet specialised and now historical) typescript/MS would possibly save some confusion when referring to the programming language, which is twelve years old now yet still gets its name marked as a spelling error. (Anecdotally, I had to double-check and consequently unlearn writing Typescript everywhere; I really thought it differed from JS in casing, only because I was blindly trusting my spellchecker.)
By the way, is maintaining these additions in the Firefox code base (i.e., NOT through the "parent" SCOWL) really the optimal solution here? Or is SCOWL fetching additions from here? I see that JavaScript is included there already: http://app.aspell.net/lookup?dict=en_US;words=JavaScript so maybe there is no reason to keep it in "Mozilla specific" any more?
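For readers unfamiliar with the slash notation used above, here is a rough Python sketch of how entries such as TypeScript/M and typescript/MS might expand into accepted forms. It assumes that M adds a possessive 's and S adds a regular plural, which is how these flags are commonly defined in en-US affix files; the authoritative rules live in the dictionary's .aff file and may differ. The function name expand is hypothetical and only for illustration.

```python
import re


def expand(entry: str) -> list[str]:
    """Expand a .dic-style entry like 'typescript/MS' into surface forms.

    Assumption: 'M' means possessive ('s) and 'S' means regular plural.
    Real affix rules are defined in the .aff file and cover more cases.
    """
    word, _, flags = entry.partition("/")
    forms = [word]
    if "M" in flags:
        # Possessive form
        forms.append(word + "'s")
    if "S" in flags:
        if re.search(r"[^aeiou]y$", word):
            # consonant + y -> ies (externality -> externalities)
            forms.append(word[:-1] + "ies")
        elif re.search(r"[sxzh]$", word):
            # sibilant endings take -es (degauss -> degausses)
            forms.append(word + "es")
        else:
            forms.append(word + "s")
    return forms


print(expand("TypeScript/M"))   # ['TypeScript', "TypeScript's"]
print(expand("typescript/MS"))  # ['typescript', "typescript's", 'typescripts']
```

Under these assumptions, TypeScript/M would accept only the capitalized name and its possessive, while the older typescript/MS entry keeps covering the lowercase noun and its plural.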
Comment 15•2 months ago
Advocacy.
We should avoid adding more company names, because they don't last.
I second that this sentiment seems potentially very harmful to the Firefox user base. Just think for a while: how many users today, and in the near future, will likely type TikTok in some spell-checked field?
Firefox's personal dictionaries can solve that.
So are we really expecting that our users will manually maintain all "new" terms, even those that have been constantly appearing in world news for several years now?
The current .dic file contains charming echoes of long-defunct companies or products, some of which operated for only a short duration, yet are still included, which is a good thing, if you ask me. I'd bet that if we counted how many times tech journalists and enthusiasts typed words like GameCube, FrontPage, ColdFusion, CompuServe, ChatZilla, DivX, BlinkList, Macromedia, and Compaq combined in the past year, that sum would be smaller than the count for TikTok alone (however sad and bitter that reality may be).
Tangential off-topic:
[...] this entire setup where a german [sic] native speaker controls the dictionary needs to end if you want us to pretend firefox [sic] is worth fixing
Amusingly, I mostly use the British dictionary, which is maintained by a lone Portuguese person, and from what I can tell, he is doing pretty good work.
Reporter
Comment 16•2 months ago
convolve, convolves, convolving, convolved
https://www.merriam-webster.com/dictionary/convolve
externality
https://www.merriam-webster.com/dictionary/externality
Comment 17•2 months ago
runnable
https://www.merriam-webster.com/dictionary/runnable
https://en.wiktionary.org/wiki/runnable
Maybe consider unrunnable and nonrunnable, though I don't have a strong opinion about that; there are not many sources outside Wiktionary.
Comment 18•25 days ago
sequiturs, non-sequiturs
https://www.merriam-webster.com/dictionary/non%20sequiturs
https://dictionary.cambridge.org/us/dictionary/english/non-sequitur
Singular non sequitur and non-sequitur are marked as correct; only the plural forms are missing. Even sequitur by itself is marked as correct, even though it's rarer. I was unsure how fossil words (Wikipedia) that are archaic/obsolete/regional/non-English outside of set phrases should be treated, given the impracticality of detecting whether they're in a valid phrase.
Comment 19•25 days ago
As for sequiturs, maybe also consider sequuntur (the proper Latin plural) and sequituri (improper but reportedly accepted (*)):
https://en.wiktionary.org/wiki/non_sequitur#:~:text=The%20legitimate,misformed.
By the way, the word misformed from there also does not pass the spell check.
(*) I see a parallel to "octopi", which is also a wrong amalgamation of a Greek root with a Latin suffix, but is accepted even by the en-US dictionary. Nobody likes the hyper-correct octopodes, though.
Comment 20•17 days ago
I'm only gonna include one source, cause I've got a lot of entries. All of these come from my personal dictionary that I've added over the course of time that I've used Firefox.
conclusionary
https://www.merriam-webster.com/dictionary/conclusionary
eyewear
https://www.merriam-webster.com/dictionary/eyewear
headwear
https://www.merriam-webster.com/dictionary/headwear
footwear
https://www.merriam-webster.com/dictionary/footwear
pareidolia
https://www.merriam-webster.com/dictionary/pareidolia
overengineer, overengineered, overengineering
https://www.merriam-webster.com/dictionary/overengineer
overgeneralization - implied by overgeneralize
https://www.merriam-webster.com/dictionary/overgeneralization
functionalize, functionalized, functionalizing, functionalizes
https://www.merriam-webster.com/dictionary/functionalize
luminance
https://www.merriam-webster.com/dictionary/luminance
chrominance
https://www.merriam-webster.com/dictionary/chrominance
degauss, degaussed, degausser, degaussers
https://www.merriam-webster.com/dictionary/degauss
chroma
https://www.merriam-webster.com/dictionary/chroma
liminal, liminalities, liminality
https://www.merriam-webster.com/dictionary/liminal
countershading
https://www.merriam-webster.com/dictionary/countershading
refried
https://www.merriam-webster.com/dictionary/refried
memetic
https://www.merriam-webster.com/dictionary/memetic
sclera - this one is surprising
https://www.merriam-webster.com/dictionary/sclera
deplatform, deplatformed, deplatforms, deplatforming
https://www.merriam-webster.com/dictionary/deplatform
lagomorph, lagomorphs
https://www.merriam-webster.com/dictionary/lagomorph
faceplate, faceplates
https://www.merriam-webster.com/dictionary/faceplate
Comment 21•17 days ago
Also, while we're on the topic, can Firefox be the first to recognize that a capitalization error is a grammar mistake and not a spelling mistake?
Comment 22•3 days ago
Neurodiversity
https://www.merriam-webster.com/dictionary/neurodiversity
https://www.oxfordlearnersdictionaries.com/definition/english/neurodiversity
https://www.dictionary.com/browse/neurodiversity
https://dictionary.cambridge.org/dictionary/english/neurodiversity
https://www.collinsdictionary.com/dictionary/english/neurodiversity
Neurodiverse
https://www.merriam-webster.com/dictionary/neurodiverse
https://www.oxfordlearnersdictionaries.com/definition/english/neurodiverse
https://www.dictionary.com/browse/neurodiverse
https://dictionary.cambridge.org/dictionary/english/neurodiverse
https://www.collinsdictionary.com/dictionary/english/neurodiverse
Neurodivergent
https://www.merriam-webster.com/dictionary/neurodivergent
https://www.oxfordlearnersdictionaries.com/definition/english/neurodivergent
https://www.dictionary.com/browse/neurodivergent
https://dictionary.cambridge.org/dictionary/english/neurodivergent
https://www.collinsdictionary.com/dictionary/english/neurodivergent
This concept is widely acknowledged by now.
Comment 23•2 days ago
stereotypical
sexually
unlikeable
authenticator
Comment hidden (off-topic)